Roaring bitmaps: Implementation of an optimized software library
Authors
Abstract
Compressed bitmap indexes are used in systems such as Git or Oracle to accelerate queries. They represent sets and often support operations such as unions, intersections, differences, and symmetric differences. Several important systems such as Elasticsearch, Apache Spark, Netflix’s Atlas, LinkedIn’s Pivot, Metamarkets’ Druid, Pilosa, Apache Hive, Apache Tez, Microsoft Visual Studio Team Services and Apache Kylin rely on a specific type of compressed bitmap index called Roaring. We present an optimized software library written in C implementing Roaring bitmaps: CRoaring. It benefits from several algorithms designed for the single-instruction-multiple-data (SIMD) instructions available on commodity processors. In particular, we present vectorized algorithms to compute the intersection, union, difference and symmetric difference between arrays. We benchmark the library against a wide range of competitive alternatives, identifying weaknesses and strengths in our software. Our work is available under a liberal open-source license.
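To make the array operations named in the abstract concrete, here is a minimal scalar sketch of intersecting two sorted arrays. It is not CRoaring's code or API: the function name `intersect_sorted_u16` is hypothetical, the 16-bit element type is an assumption not stated in the abstract, and the vectorized (SIMD) algorithms the paper presents are not reproduced here. It only shows the result such an intersection is expected to produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of CRoaring.
 * Scalar intersection of two sorted, duplicate-free arrays of 16-bit values
 * (the element width is an assumption made for this sketch).
 * Writes the common values to out, which must have room for at least
 * min(na, nb) elements, and returns how many values were written. */
size_t intersect_sorted_u16(const uint16_t *a, size_t na,
                            const uint16_t *b, size_t nb,
                            uint16_t *out) {
    assert(out != NULL);
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                 /* a[i] cannot appear in the rest of b */
        } else if (b[j] < a[i]) {
            j++;                 /* b[j] cannot appear in the rest of a */
        } else {
            out[k++] = a[i];     /* value present in both inputs */
            i++;
            j++;
        }
    }
    return k;
}
```

A scalar merge like this does one comparison per step; a SIMD version would aim to compare several elements per instruction while returning the same output.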
Similar resources
Consistently faster and smaller compressed bitmaps with Roaring
Compressed bitmap indexes are used in databases and search engines. A wide range of bitmap compression techniques has been proposed, almost all relying primarily on run-length encoding (RLE), including BBC and WAH. However, on unsorted data, we can get superior performance with a hybrid compression technique that uses both uncompressed bitmaps and packed arrays inside a two-level tree. An inst...
Better bitmap performance with Roaring bitmaps
Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle’s lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it...
Macromodule technology
Developing parallel, optimized, and portable software is a difficult problem. It involves learning libraries and programming techniques (such as optimization and parallelization), using the libraries, and modifying written code. At present there is a set of optimized libraries for a large number of tasks. But each library is unique, it demands studying and is optimized for spe...
A High Performance FPGA-Based Accelerator for BLAS Library Implementation
This paper describes the implementation and the performance analysis of a hardware accelerator for the BLAS library matrix multiplication operation. This accelerator is based on a dual-FPGA board and on a BLAS software library implementation that makes use of the FPGA-based hardware. In order to evaluate the performance of such a system, we implemented the matrix multiplication operation (BLAS “dg...
An Optimized Matrix Multiplication on ARMv7 Architecture
A sufficiently optimized matrix multiplication on embedded systems can facilitate data processing in high performance mobile measuring equipment since plenty of the kernel mathematical algorithms are based on matrix multiplication. In this paper, we propose a matrix multiplication specially optimized for ARMv7 architecture. The performance-critical differences between ARMv7 and conventional des...
Journal title:
- Softw., Pract. Exper.
Volume 48, issue (not given)
Pages: (not given)
Publication date: 2018